多状态生存分析(MSA)使用多状态模型来分析事件时间数据。在医疗应用中,MSA可以提供有关患者复杂疾病进展的见解。 MSA中的一个关键挑战是对多状态模型数量(例如过渡概率和状态职业概率)的精确预测。传统的多状态方法,例如Aalen-Johansen(AJ)估计量和基于COX的方法,分别受马尔可夫和比例危害假设的限制,并且对于做出特定于主题的预测而言是不可行的。 MSA的神经普通微分方程放宽了这些假设,但在计算上很昂贵,并且不会直接建模过渡概率。为了解决这些局限性,我们提出了一类新的基于伪值的深度学习模型,用于多态生存分析,我们表明,旨在处理审查的伪值 - 可以自然替代多国家模型源自一致的估计器时的数量。特别是,我们提供了一种算法来从一致的估计器中得出伪值,以直接预测受试者协变量的多状态生存量。合成和现实世界数据集的经验结果表明,我们提出的模型在各种审查设置下实现了最新的结果。
translated by 谷歌翻译
生存分析(事件时间分析)是医疗保健中的重要问题,因为它对患者和姑息治疗产生了广泛的影响。许多生存分析方法都认为,生存数据是从一个医疗中心或通过多中心的数据共享中心可获得的。但是,患者属性和严格的隐私法的敏感性越来越多地禁止分享医疗保健数据。为了应对这一挑战,研究界研究了使用联合学习(FL)范式分散培训和模型参数共享的解决方案。在本文中,我们研究了FL对分布式医疗保健数据集进行生存分析的利用。最近,流行的COX比例危害(CPH)模型已适用于FL设置。然而,由于其线性性和比例危害假设,CPH模型会导致次优性能,尤其是对于非线性,非IID和经过严格审查的生存数据集。为了克服现有联合生存分析方法的挑战,我们利用深度学习模型的预测准确性和伪价值的力量来提出一种首个基于伪价值的基于伪价值的深度学习模型(用于联合生存分析)( FSA)称为FedPseudo。此外,我们引入了一种新的方法,该方法是在FL设置中得出伪值的伪值,从而加快了伪值的计算。关于合成和现实世界数据集的广泛实验表明,我们的基于伪有价值的FL框架的性能与最佳中心训练的深层生存分析模型相似。此外,我们提出的FL方法为各种审查设置获得了最佳结果。
translated by 谷歌翻译
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provides useful insights for better understanding and utilization of missing values in time series analysis.
translated by 谷歌翻译
Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post process this data for scientific discovery. Unlike image and video compression algorithms that limit errors to primary data, scientists require compression techniques that accurately preserve derived quantities of interest (QoIs). This paper presents a physics-informed compression technique implemented as an end-to-end, scalable, GPU-based pipeline for data compression that addresses this requirement. Our hybrid compression technique combines machine learning techniques and standard compression methods. Specifically, we combine an autoencoder, an error-bounded lossy compressor to provide guarantees on raw data error, and a constraint satisfaction post-processing step to preserve the QoIs within a minimal error (generally less than floating point error). The effectiveness of the data compression pipeline is demonstrated by compressing nuclear fusion simulation data generated by a large-scale fusion code, XGC, which produces hundreds of terabytes of data in a single day. Our approach works within the ADIOS framework and results in compression by a factor of more than 150 while requiring only a few percent of the computational resources necessary for generating the data, making the overall approach highly effective for practical scenarios.
translated by 谷歌翻译
We propose a novel model agnostic data-driven reliability analysis framework for time-dependent reliability analysis. The proposed approach -- referred to as MAntRA -- combines interpretable machine learning, Bayesian statistics, and identifying stochastic dynamic equation to evaluate reliability of stochastically-excited dynamical systems for which the governing physics is \textit{apriori} unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and aleatoric uncertainty because of environmental effect and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.
translated by 谷歌翻译
We present DyFOS, an active perception method that Dynamically Finds Optimal States to minimize localization error while avoiding obstacles and occlusions. We consider the scenario where a ground target without any exteroceptive sensors must rely on an aerial observer for pose and uncertainty estimates to localize itself along an obstacle-filled path. The observer uses a downward-facing camera to estimate the target's pose and uncertainty. However, the pose uncertainty is a function of the states of the observer, target, and surrounding environment. To find an optimal state that minimizes the target's localization uncertainty, DyFOS uses a localization error prediction pipeline in an optimization search. Given the states mentioned above, the pipeline predicts the target's localization uncertainty with the help of a trained, complex state-dependent sensor measurement model (which is a probabilistic neural network in our case). Our pipeline also predicts target occlusion and obstacle collision to remove undesirable observer states. The output of the optimization search is an optimal observer state that minimizes target localization uncertainty while avoiding occlusion and collision. We evaluate the proposed method using numerical and simulated (Gazebo) experiments. Our results show that DyFOS is almost 100x faster than yet as good as brute force. Furthermore, DyFOS yielded lower localization errors than random and heuristic searches.
translated by 谷歌翻译
Adversarial training has been empirically shown to be more prone to overfitting than standard training. The exact underlying reasons still need to be fully understood. In this paper, we identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples. To address this, we propose an alternative approach that leverages the misclassified samples to mitigate the overfitting problem. We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods on a wide range of computer vision, natural language processing, and tabular tasks.
translated by 谷歌翻译
Adversarial training is widely acknowledged as the most effective defense against adversarial attacks. However, it is also well established that achieving both robustness and generalization in adversarially trained models involves a trade-off. The goal of this work is to provide an in depth comparison of different approaches for adversarial training in language models. Specifically, we study the effect of pre-training data augmentation as well as training time input perturbations vs. embedding space perturbations on the robustness and generalization of BERT-like language models. Our findings suggest that better robustness can be achieved by pre-training data augmentation or by training with input space perturbation. However, training with embedding space perturbation significantly improves generalization. A linguistic correlation analysis of neurons of the learned models reveal that the improved generalization is due to `more specialized' neurons. To the best of our knowledge, this is the first work to carry out a deep qualitative analysis of different methods of generating adversarial examples in adversarial training of language models.
translated by 谷歌翻译
Millions of people participate in online peer-to-peer support sessions, yet there has been little prior research on systematic psychology-based evaluations of fine-grained peer-counselor behavior in relation to client satisfaction. This paper seeks to bridge this gap by mapping peer-counselor chat-messages to motivational interviewing (MI) techniques. We annotate 14,797 utterances from 734 chat conversations using 17 MI techniques and introduce four new interviewing codes such as chit-chat and inappropriate to account for the unique conversational patterns observed on online platforms. We automate the process of labeling peer-counselor responses to MI techniques by fine-tuning large domain-specific language models and then use these automated measures to investigate the behavior of the peer counselors via correlational studies. Specifically, we study the impact of MI techniques on the conversation ratings to investigate the techniques that predict clients' satisfaction with their counseling sessions. When counselors use techniques such as reflection and affirmation, clients are more satisfied. Examining volunteer counselors' change in usage of techniques suggest that counselors learn to use more introduction and open questions as they gain experience. This work provides a deeper understanding of the use of motivational interviewing techniques on peer-to-peer counselor platforms and sheds light on how to build better training programs for volunteer counselors on online platforms.
translated by 谷歌翻译
Achieving knowledge sharing within an artificial swarm system could lead to significant development in autonomous multiagent and robotic systems research and realize collective intelligence. However, this is difficult to achieve since there is no generic framework to transfer skills between agents other than a query-response-based approach. Moreover, natural living systems have a "forgetfulness" property for everything they learn. Analyzing such ephemeral nature (temporal memory properties of new knowledge gained) in artificial systems has never been studied in the literature. We propose a behavior tree-based framework to realize a query-response mechanism for transferring skills encoded as the condition-action control sub-flow of that portion of the knowledge between agents to fill this gap. We simulate a multiagent group with different initial knowledge on a foraging mission. While performing basic operations, each robot queries other robots to respond to an unknown condition. The responding robot shares the control actions by sharing a portion of the behavior tree that addresses the queries. Specifically, we investigate the ephemeral nature of the new knowledge gained through such a framework, where the knowledge gained by the agent is either limited due to memory or is forgotten over time. Our investigations show that knowledge grows proportionally with the duration of remembrance, which is trivial. However, we found minimal impact on knowledge growth due to memory. We compare these cases against a baseline that involved full knowledge pre-coded on all agents. We found that knowledge-sharing strived to match the baseline condition by sharing and achieving knowledge growth as a collective system.
translated by 谷歌翻译